steppe
2/14/2022
Assigned Reading: Chapter 3 of Spatial Data Science
Pioneer Mountains, Idaho
DE-9IM, By Krauss
In the example in the Upper Middle panel
The dimensions which contain the intersection of the ‘Interior’ of ‘A’ to the ‘Boundary’ of ‘B’ is equal to a line.
\[ dim[I(a)∩B(b)] = 1 \]
\[ \text{∩ is the 'Intersection'} \]
In the example in the Upper Middle panel
The dimensions which contain the intersection of the ‘Interior’ of ‘A’ to the ‘Boundary’ of ‘B’ is equal to a line.
When identifying solutions to a problem we are generally interested in how features affect each other. This schema provides us a framework for organizing our area of analyses, and subsetting the appropriate data.
We are generally interested in :
Spatial Predicates after Egenhofer and Herring (1990)
Disjoint Neither ‘A’ nor ‘B’ touch at any position
Touches The boundaries of ‘A’ and ‘B’ touch
Covers Feature ‘A’ is encapsulated by ‘B’ on many sides, at least a boundary of ‘A’ shares at least a partial border with a border of ‘B’
Contains All borders of one feature are contained by another
Overlap Portions of the interior of feature ‘A’ overlap the interior of ‘B’
Equals ‘A’ and ‘B’ have identical extents.
postGIS implements the following three (additional) tests:
Intersects (includes: ‘st_touches’, ‘st_overlaps’, ‘st_covers’, ‘st_contain’, ‘st_equals’)
Within (includes: ‘st_contains’)
CoveredBy (includes: ‘st_equals’)
st_disjoint(A, B_disjoint, sparse = F): TRUE
st_touches(A, B_touches, sparse = F): TRUE
st_equals(A, B_overlap, sparse = F): TRUE
st_covers(A, B_covers, sparse = F): TRUE
st_contains(A, B_contains, sparse = F): TRUE
st_equals(A, B_equals, sparse = F): TRUE
st_equals(A, B_disjoint, sparse = F): FALSE
st_intersects(A, B_touches, sparse = F): TRUE
st_intersects(A, B_overlap, sparse = F): TRUE
st_intersects(A, B_covers, sparse = F): TRUE
st_intersects(A, B_contains, sparse = F): TRUE
st_intersects(A, B_equals, sparse = F): TRUE
st_within(A, B_contains, sparse = F): FALSE
st_within(B_contains, A, sparse = F): TRUE
st_coveredby(A, B_equals, sparse = F): TRUE
nghbrhd_parks <- st_join(chi_neighb, chi_parks,
join = st_intersects,
left = TRUE
) %>%
count(pri_neigh) This is really a grab bag of what I and others seem to use most the often. We are going to focus on vector data.
files <- paste0('spatial_lecture_data/Chicago_Neighborhoods/' ,
list.files('spatial_lecture_data/Chicago_Neighborhoods', ".shp"))
chi_neighb <- read_sf(files)
rm(files)[1] "sf" "data.frame"
[1] "SpatialPolygonsDataFrame"
attr(,"package")
[1] "sp"
museums.sf <- st_as_sf(museums,
coords = c(x = 'longitude', y = 'latitude'),
crs = 4326,
remove = F
)| attribute | latitude | longitude | geometry |
|---|---|---|---|
| Field Museum | 41.86666 | -87.61643 | POINT (-87.61643 41.86666) |
| Science and Industry | 41.79006 | -87.58227 | POINT (-87.58227 41.79006) |
## Coordinate Reference System:
## User input: EPSG:4326
## wkt:
## GEOGCRS["WGS 84",
## DATUM["World Geodetic System 1984",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## USAGE[
## SCOPE["Horizontal component of 3D system."],
## AREA["World."],
## BBOX[-90,-180,90,180]],
## ID["EPSG",4326]]
[1] "EPSG:5070"
p1 = st_as_sfc("POLYGON((0 0, 0 10, 10 0, 10 10, 0 0))")
# create a geometry which will not be validst_is_valid(p1): FALSE
st_is_valid(p1, reason = TRUE): Self-intersection[5 5]
If we call class(p1), we can see that our original simple feature collection (sfc) contained: sfc_POLYGON
If we call class(p2), we can see that our valid simple feature collection (sfc) contains: sfc_MULTIPOLYGON
Simple dplyr operations be used to combine the geometries of features
Can be used to combine the geometries of features
| NAME | REGION | geometry |
|---|---|---|
| Alabama | South | MULTIPOLYGON (((-88.20006 3… |
| Arizona | West | MULTIPOLYGON (((-114.7196 3… |
| Colorado | West | MULTIPOLYGON (((-109.0501 4… |
| Connecticut | Norteast | MULTIPOLYGON (((-73.48731 4… |
| Florida | South | MULTIPOLYGON (((-81.81169 2… |
| Georgia | South | MULTIPOLYGON (((-85.60516 3… |
| REGION | geometry |
|---|---|
| Norteast | MULTIPOLYGON (((-70.11867 4… |
| Midwest | MULTIPOLYGON (((-85.48703 4… |
| South | MULTIPOLYGON (((-81.68524 2… |
| West | MULTIPOLYGON (((-118.4887 3… |
Units: [m^2]
[1] 4498293.6 200563.5 3016684.2 972375.8 11596322.7
These data are always returned in units of meters, regardless of the coordinate system specifications. However, different Coordinate Systems will give different results
[1] 2
integer(0)
81852.82 [m^2]
42261063 [m^2]
integer(0)
When we use the ‘which’ logical test, we see that all areas in the results from both sets of area calculations are different. When we plot the values we see that both sets have nearly perfect correlation.
But neither of these values are as accurate as they could be. We want to use an Equal Area projected coordinate system instead.
Here we use the NAD83 Conus Albers, a favorite of the United States Geological Survey (USGS).
equal_areas_proj <- chi_neighb%>%
st_transform(5070) %>%
# We want to use an equal area projection!
st_area()
equal_areas_proj <- sort(equal_areas_proj)While metric values are easy to convert by ‘hand’, we can also convert them using the ‘units’ package.
Units: [km^2]
[1] 0.08197556 0.15334606 0.20086315 0.25781886 0.31268310 0.32418247
|
Smallest
|
Largest
|
||
|---|---|---|---|
| Neighborhood | Area (m^2) | Neighborhood | Area (m^2) |
| Magnificent Mile | 81975.56 | O’Hare | 34545390 |
| Greektown | 153346.06 | South Deering | 28222390 |
| Printers Row | 200863.15 | Englewood | 16127577 |
| Millenium Park | 257818.86 | Austin | 15796991 |
| Boystown | 312683.10 | Hegewisch | 13559961 |
18578.52 [m]
rogers_park <- chi_neighb %>%
filter(pri_neigh == 'Rogers Park')
rogers_park <- st_cast(rogers_park, to = 'MULTILINESTRING')
rogers_park <- st_cast(rogers_park, to = 'LINESTRING')10370.89 [m]
Centroid the “geometric center of mass of a geometry” of an object.
If you just need an arbitrary point on the surface of an object Point on Surface will suffice.
chi_sub <- chi_neighb[sample(size = 30, 1:nrow(chi_neighb)),]
chi_union <- st_union(chi_sub)
chi_cast <- st_cast(chi_union, to = "POLYGON")| Neighborhood | geometry |
|---|---|
| Hegewisch | MULTIPOLYGON (((-87.52462 4… |
| Chicago Lawn | MULTIPOLYGON (((-87.67815 4… |
| South Shore | MULTIPOLYGON (((-87.5921 41… |
| East Village | MULTIPOLYGON (((-87.67705 4… |
| Ukrainian Village | MULTIPOLYGON (((-87.67705 4… |
| Lower West Side | MULTIPOLYGON (((-87.63516 4… |
| geometry |
|---|
| MULTIPOLYGON (((-87.52501 4… |
| geometry |
|---|
| POLYGON ((-87.52501 41.6918… |
| POLYGON ((-87.67705 41.8959… |
| POLYGON ((-87.54612 41.7231… |
| POLYGON ((-87.62761 41.8745… |
| POLYGON ((-87.83658 41.9864… |
| POLYGON ((-87.94001 41.9981… |
chi_buffer_5k <- st_buffer(chi_union, dist = 3000)
chi_ch_by_neigh <- st_convex_hull(chi_neighb)
chi_ch_cit <- st_convex_hull(chi_union)
chi_ch_buf <- st_convex_hull(st_buffer(chi_union, dist = 3000))Buffer
Convex Hull of Chicago Neighborhoods, The City, and a buffer of 3km
Theory: Remember the earth is a ellipsoid, distances must be calculated along the earths curved surface
Great Circle Distance, by: ChecCheDaWaff
| Distance (Km) | Journey |
|---|---|
| 1144.3 | American to Field |
| 334.7 | American to Smithsonian |
| 955.4 | Field to Smithonian |
We can visualize this with some help from Charlie Joey Hadley whom some sf compliant great circle distance code
| Neighborhood | Distance |
|---|---|
| Little Italy, UIC | 10291.97 |
| Lower West Side | 10350.90 |
| West Loop | 10426.27 |
| Greektown | 10455.72 |
| United Center | 10498.78 |
Neighbour list object:
Number of regions: 81
Number of nonzero links: 1780
Percentage nonzero weights: 27.13001
Average number of links: 21.97531
2 regions with no links:
13 43
Link number distribution:
0 4 5 6 7 8 9 15 17 18 20 21 22 23 24 25 26 27 28 29 30 32 33 34
2 1 1 1 4 3 2 2 2 1 5 3 1 2 9 10 8 9 2 3 3 1 4 2
1 least connected region:
81 with 4 links
2 most connected regions:
26 27 with 34 links
Density-Based Spatial Clustering of Applications with Noise (DBSCAN).
Classification algorithm for grouping objects in space
Developed in the mid 90’s, still used and highly cited!
Requires two terms
Ripley’s K
Future Bonus SDS Office Hours Wednesday Night at 5:00 - 6:00 in F-413.
Notes on this Lab My lecture notes are in the R script I used to generate all of the novel figures for this presentation. This script is much longer and more complex than Lecture 1’s script. Likewise this presentation is an .HTML file and can be launched from your computer (it was rendered directly from R using the script).
https://cran.r-project.org/web/packages/gstat/vignettes/gstat.pdf Accessed 1.21.2021
Bivand, R. https://cran.r-project.org/web/packages/spdep/vignettes/nb_sf.html Accessed 1.17.2021
Clementini, Eliseo; Di Felice, Paolino; van Oosterom, Peter (1993). “A small set of formal topological relationships suitable for end-user interaction”. In Abel, David; Ooi, Beng Chin (eds.). Advances in Spatial Databases: Third International Symposium, SSD ’93 Singapore, June 23–25, 1993 Proceedings. Lecture Notes in Computer Science. 692/1993. Springer. pp. 277–295. doi:10.1007/3-540-56869-7_16. ISBN 978-3-540-56869-8. Accessed 1.16.2021
Egenhofer, M.J.; Herring, J.R. (1990). “A Mathematical Framework for the Definition of Topological Relationships”
Ester, Martin; Kriegel, Hans-Peter; Sander, Jörg; Xu, Xiaowei (1996). Simoudis, Evangelos; Han, Jiawei; Fayyad, Usama M. (eds.). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96). AAAI Press. pp. 226–231. CiteSeerX 10.1.1.121.9220. ISBN 1-57735-004-9.
Hadley, C.J. https://www.findingyourway.io/blog/2018/02/28/2018-02-28_great-circles-with-sf-and-leaflet/ Accessed 1.16.2021
Gimond, M. https://mgimond.github.io/Spatial/point-pattern-analysis-in-r.html Accessed 1.22.2022
Moreno, M., Basille, M. https://r-spatial.org/r/2018/10/25/ggplot2-sf-2.html Accessed 1.22.2022
Dennett, A. https://rstudio-pubs-static.s3.amazonaws.com/126356_ef7961b3ac164cd080982bc743b9777e.html Accessed 1.21.2022
c("raster", "sp", "sf", "tidyverse", "terra", 'spData', 'spdep', 'gstat', 'spatstat', 'fields' ) %>%
map(citation) %>%
print(style = "text", na.print = '')[[1]]
Hijmans R (2022). _raster: Geographic Data Analysis and Modeling_. R
package version 3.5-15, <URL:
https://CRAN.R-project.org/package=raster>.
[[2]]
Pebesma EJ, Bivand RS (2005). "Classes and methods for spatial data in
R." _R News_, *5*(2), 9-13. <URL:
https://CRAN.R-project.org/doc/Rnews/>.
Bivand RS, Pebesma E, Gomez-Rubio V (2013). _Applied spatial data
analysis with R, Second edition_. Springer, NY. <URL:
https://asdar-book.org/>.
[[3]]
Pebesma E (2018). "Simple Features for R: Standardized Support for
Spatial Vector Data." _The R Journal_, *10*(1), 439-446. doi:
10.32614/RJ-2018-009 (URL: https://doi.org/10.32614/RJ-2018-009), <URL:
https://doi.org/10.32614/RJ-2018-009>.
[[4]]
Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R,
Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E,
Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi
K, Vaughan D, Wilke C, Woo K, Yutani H (2019). "Welcome to the
tidyverse." _Journal of Open Source Software_, *4*(43), 1686. doi:
10.21105/joss.01686 (URL: https://doi.org/10.21105/joss.01686).
[[5]]
Hijmans R (2022). _terra: Spatial Data Analysis_. R package version
1.5-18, <URL: https://rspatial.org/terra/>.
[[6]]
Bivand R, Nowosad J, Lovelace R (2021). _spData: Datasets for Spatial
Analysis_. R package version 2.0.1, <URL:
https://CRAN.R-project.org/package=spData>.
[[7]]
Bivand R, Wong DWS (2018). "Comparing implementations of global and
local indicators of spatial association." _TEST_, *27*(3), 716-748.
<URL: https://doi.org/10.1007/s11749-018-0599-x>.
Bivand RS, Pebesma E, Gomez-Rubio V (2013). _Applied spatial data
analysis with R, Second edition_. Springer, NY. <URL:
https://asdar-book.org/>.
[[8]]
Pebesma EJ (2004). "Multivariable geostatistics in S: the gstat
package." _Computers & Geosciences_, *30*, 683-691.
Gräler B, Pebesma E, Heuvelink G (2016). "Spatio-Temporal Interpolation
using gstat." _The R Journal_, *8*, 204-218. <URL:
https://journal.r-project.org/archive/2016/RJ-2016-014/index.html>.
[[9]]
Baddeley A, Rubak E, Turner R (2015). _Spatial Point Patterns:
Methodology and Applications with R_. Chapman and Hall/CRC Press,
London. <URL:
https://www.routledge.com/Spatial-Point-Patterns-Methodology-and-Applications-with-R/Baddeley-Rubak-Turner/9781482210200/>.
Baddeley A, Turner R, Mateu J, Bevan A (2013). "Hybrids of Gibbs Point
Process Models and Their Implementation." _Journal of Statistical
Software_, *55*(11), 1-43. doi: 10.18637/jss.v055.i11 (URL:
https://doi.org/10.18637/jss.v055.i11).
Baddeley A, Turner R (2005). "spatstat: An R Package for Analyzing
Spatial Point Patterns." _Journal of Statistical Software_, *12*(6),
1-42. doi: 10.18637/jss.v012.i06 (URL:
https://doi.org/10.18637/jss.v012.i06).
[[10]]
Douglas Nychka, Reinhard Furrer, John Paige, Stephan Sain (2021).
"fields: Tools for spatial data." R package version 13.3, <URL:
https://github.com/dnychka/fieldsRPackage>.